ABSTRACT


Researchers have highlighted how tracking learners’ eye-gaze can 
reveal their reading behaviors and strategies, and this provides a 
framework for developing personalized feedback to improve learn- 
ing and problem solving skills. In this paper, we describe analyses 
of eye-gaze data collected from 16 middle school students who 
worked with Betty’s Brain, an open-ended learning environment, 
where students learn science by building causal models to teach a 
virtual agent. Our goal was to test whether newly available con- 
sumer-level eye trackers could provide the data that would allow us 
to probe further into the relations between students’ reading of hy- 
pertext resources and building of graphical causal maps. We col- 
lected substantial amounts of gaze data and then constructed clas- 
sifier models to predict whether students would be successful in 
constructing correct causal links. These models predicted correct 
map-building actions with an accuracy of 80% (F1 = 0.82; Cohen’s
kappa κ = 0.62). The proportions of correct link additions are in
turn directly related to learners’ performance in Betty's Brain. 
Therefore, students’ gaze patterns when reading the resources may 
be good indicators of their overall performance. These findings can 
be used to support the development of a real-time eye gaze analysis 
system, which can detect students’ reading patterns and, when
necessary, provide support to help them become better readers.


Keywords 


Eye-Gaze Data Analysis; Computer-Based Learning Environment; 
Reading Behavior; Classification. 


1. INTRODUCTION 


In a number of computer-based learning environments (CBLEs), 
students are expected to learn and refresh their domain knowledge 
from resources (typically in text or hypertext form with figures), 
then to construct solutions to assigned problems based on their 
learned knowledge. Such environments are known to help students 
develop cognitive skills and strategic reasoning processes, and, 
therefore, help students not only learn the domain content but pre- 
pare them for future learning [2, 3, 5, 17, 30-32]. However, because 
of the open-ended nature of these environments, novice learners of- 
ten have difficulties in making progress toward their goals and 
completing their solutions. Therefore, the ability to track and un- 
derstand learners’ performance and behaviors is important for their 
overall success, so that relevant personalized feedback and instruction
can be provided to them as necessary. However, tracking students’
reading behaviors with sufficient precision and accuracy in
computer-based learning environments is a non-trivial task. 


Technologies such as eye-tracking devices can provide behavioral
metrics that researchers can use to study learners’ basic
cognitive processes and other information processing skills during 
reading [12, 27, 28, 35]. For educational research and applications, 
use of eye-tracking data has mainly focused on studying the effects 
of instructional strategies on eye-gaze behavior [21]. Some of these 
studies focus on learning how students’ spatial contiguity [16], at- 
tention level [23] and viewing behavior [1] affect the cognitive pro- 
cesses that mediate learning outcomes. Conati et al. [7] have re- 
viewed previous studies that modeled students’ cognitive, metacog- 
nitive and affective states in intelligent learning environments using 
eye-gaze data. For example, Bondareva, et al. [4] assessed student 
learning from eye-gaze data during interaction with MetaTutor, an 
intelligent CBLE designed to develop self-regulated learning skills 
when generating summaries after reading about complex science 
topics. The MetaTutor study reported 78% classification accuracy 
on student learning based on features extracted from gaze data
alone. Similar results were reported by Kardan and Conati [18], in 
modeling students’ learning with interactive simulations. 


Peterson, et al. [25] report that learners’ eye-gaze and pupil dilation 
data were used to predict performance and learning gains in Chem- 
Tutor, designed to teach chemistry. Hutt, et al. [14] studied stu- 
dents’ mind wandering using eye-gaze on specific areas of interest 
(AOI) [10]. All of these results show that eye-tracking devices help
to track learners’ reading behaviors in CBLEs. Most of this research 
has relied upon expensive research-grade eye-tracking devices ap- 
propriate primarily for lab settings. However, newly available con- 
sumer-level eye-trackers are relatively inexpensive and have re- 
cently been deployed in classroom environments [14]. Our goal in 
this study is to run an initial proof of concept case study to demon- 
strate that these consumer-grade eye-tracking devices with sam- 
pling rates less than 90 Hz can effectively predict learners’ behav- 
iors in CBLEs. 


In the research reviewed above [1, 4, 7, 16, 18, 22, 23], eye gaze
features were extracted using global gaze features computed across 
broad Areas of Interest (AOI) that do not differentiate between 
more fine-grained screen contents. For example, the features ex- 
tracted in [4] are based on predefined window position in the learn- 
ing environment. This can be a limiting factor in CBLEs, where 
students are expected to learn by combining information from mul- 
tiple hypertext resources. In Betty's Brain, a CBLE developed by 
our group [3, 24], students build a causal map to teach their agent, 
using hypertext resources that span multiple pages. Students are ex- 
pected to find, read, and interpret sentences that provide infor- 
mation about entities and causal relations between entities, and add 
the link(s) to the current causal model. Extracting students’ eye- 
gaze features as they read these hypertext resources would require 
a different AOI for each hypertext resource page. To address this 
challenge we propose a methodology to extract eye-gaze features 
that are directly related to content in each of the hypertext resource
pages.


The proposed methodology was applied to eye-gaze data collected 
from middle school students who worked with the Betty’s Brain learning
environment. The features extracted from the eye-gaze data were 
then used to construct classifier models that predict learners’ model 
building effectiveness given their reading characteristics. For our 
study, we were able to predict learner performance in causal map 
building with an accuracy of 80% (F1 = 0.82; Cohen’s kappa κ =
0.62). The learned classifier model was then used to classify learners’
reading behavior, which was in turn related to learners’ performance
on map-building actions in Betty's Brain. These findings can be used
to support the development of a real-time eye-gaze analysis system 
to provide personalized feedback and adaptive instructions. 


The rest of the paper is organized as follows. Section 2 describes 
the learning environment. Section 3 describes the proposed methodology
to extract content-based eye-gaze features from a learning
environment with multiple hypertext resources. Section 4 describes
the experimental design, the data collection, and the methodology used
to preprocess the data and train the classifiers to predict learning
based on features extracted solely from eye-gaze data. The results are
reported in Section 5. Conclusions, limitations, and future work are
discussed in Section 6.


2. BACKGROUND: THE BETTY’S BRAIN 
LEARNING ENVIRONMENT 


The Betty’s Brain learning environment [24] assigns learners the 
task of teaching a science topic to a teachable agent named Betty 
by constructing a visual causal map consisting of a set of entities 
connected by directed causal links. As students build their map, 
they can ask Betty questions, and she can answer them and explain her
answers. The students’ goal is to teach Betty a causal map that 
matches a hidden expert model of the topic. 


Students’ activities are categorized into three primary action types: 
(1) reading hypertext resources on the science topic (READ), (2) 
building the causal map (BUILD), and (3) assessing (ASSESS) the 
correctness of the map [8]. Students iterate among these activities 
until they have taught Betty a correct model. In this paper, we study
learners’ information acquisition processes, primarily their reading of
the hypertext resources. These resources describe the science topic
under study (e.g., human causes and effects of climate change) by
breaking it down into a set of subtopics. Each subtopic describes a
system or a process (e.g., the greenhouse effect) in terms of entities
(e.g., absorbed heat energy) and causal relations among these entities
(e.g., absorbed heat energy increases the average global temperature).
As students read about the topic, they extract the causal relations
between entities and construct the causal map to teach Betty. Figure
1 illustrates the Betty’s Brain READ (set of hypertext resources)
and BUILD interfaces.


Figure 1. Betty’s Brain system showing (a) READ (Science resources)
and (b) BUILD (Causal Map) Interfaces


Students can assess their own understanding and success in teaching
Betty by:

1. Querying Betty using a template for asking cause-effect ques- 
tions. A second pedagogical “mentor” agent, Mr. Davis, helps 
grade Betty’s answers by comparing them against the expert 
model. 

2. Asking Betty to take a quiz, which helps them evaluate the 
current state of the map. 


In addition to the three major actions (READ, BUILD, and ASSESS),
students can also take NOTES on information from the science
book, and CONVERSE with Betty or Mr. Davis. Students’ interactions
with the environment are recorded in log files with associated
timestamps.


Student performance in the Betty’s Brain environment is measured 
by their current “map score”, which is computed as the difference 
between the number of correct and incorrect links present in the 
student’s map at any point of time. Depending on the edit actions 
performed by the student, map score can increase, decrease, or re- 
main the same. Map score patterns vary among students and display 
their individual learning behaviors. 
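

The map score definition above can be sketched in a few lines of Python. This is only an illustration: the link representation (a tuple of source entity, target entity, and sign) and the function name `map_score` are assumptions, since the text does not describe how Betty's Brain stores links internally. The two expert links used here are the causal relations quoted elsewhere in this paper.

```python
from typing import Iterable, Set, Tuple

# Hypothetical link representation: (source entity, target entity, "+" or "-").
Link = Tuple[str, str, str]


def map_score(student_links: Iterable[Link], expert_links: Set[Link]) -> int:
    """Return (# correct links) - (# incorrect links) for the current map."""
    student = set(student_links)
    correct = len(student & expert_links)      # links also in the expert model
    incorrect = len(student - expert_links)    # links absent from the expert model
    return correct - incorrect


expert = {
    ("solar energy", "absorbed light energy", "+"),
    ("absorbed heat energy", "average global temperature", "+"),
}
student = [
    ("solar energy", "absorbed light energy", "+"),       # matches the expert model
    ("solar energy", "average global temperature", "-"),  # not in the expert model
]
print(map_score(student, expert))  # prints 0 (one correct link, one incorrect link)
```

Under this sketch, adding a correct link raises the score by one, adding an incorrect link lowers it by one, and deleting an incorrect link raises it again.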


Students’ learning behaviors in Betty’s Brain are modeled according
to a cognitive/metacognitive task model [19]. Their interactions
with the system are mapped to particular skills (for example, reading
hypertext resources is mapped to an information acquisition
skill), which are then interpreted in terms of the overall learning
objectives. A sequential combination of skills, performed in a context,
is interpreted as a problem-solving strategy. Researchers have
employed a combination of analytics methods [34] and exploratory
sequence mining techniques for detecting and characterizing students’
metacognitive processes [20] in the Betty’s Brain environment.
Betty’s Brain has been shown to significantly improve student
learning, as measured by gains observed from pre- to post-tests
[9, 19, 20, 24, 34].


An important component that governs students’ learning and causal 
reasoning processes in Betty’s Brain is their ability to interpret the 
information provided in the hypertext resources and convert it into 
efficient causal links. However, this information extraction and in- 
terpretation procedure cannot be captured completely from our log 
files. The use of eye tracking devices can help us track the reading 
behaviors of students and provide more insight into this procedure. 
Hence, our goal in this work is to use eye tracking devices in class- 
rooms to better understand students’ learning behaviors as they in- 
teract with Betty’s Brain in authentic settings. In the next section, 
we describe our proposed methodology to extract eye-gaze features 
that are directly related to content in each of the hypertext re- 
sources. 


3. METHODOLOGY TO EXTRACT EYE- 
GAZE FEATURES 


The steps involved in extracting content-based eye-gaze features
from hypertext resources in an open-ended learning environment
are shown in Figure 2. To extract features, we first align the
log data (Figure 2(a)) from the learning environment with the raw
data (Figure 2(b)) from the eye-tracking device. Then the Areas of
Interest (AOIs) for each section of the hypertext resources (stored in
a key file) are aligned with these data and used to extract the
content-based eye-gaze features. The details of the log data and the
key file are described below.
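

The timestamp-based alignment step can be sketched as follows. The record layouts are assumptions made for illustration (log events as a timestamp plus the resource page that becomes visible, gaze samples as a timestamp plus screen coordinates); the actual log and eye-tracker formats are not specified here, and the function name `align` is hypothetical.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

# Hypothetical formats: a log event is (timestamp_s, page name), with None
# meaning that no resource page is shown; a gaze sample is (timestamp_s, x, y).
LogEvent = Tuple[float, Optional[str]]
GazeSample = Tuple[float, float, float]


def align(log: List[LogEvent], gaze: List[GazeSample]) -> List[Tuple[str, float, float, float]]:
    """Label each gaze sample with the resource page open at its timestamp."""
    events = sorted(log, key=lambda event: event[0])
    times = [t for t, _ in events]
    aligned = []
    for t, x, y in gaze:
        i = bisect_right(times, t) - 1  # index of the last event at or before t
        if i >= 0 and events[i][1] is not None:
            aligned.append((events[i][1], t, x, y))
    return aligned


log = [(0.0, "page A"), (10.0, None), (20.0, "page B")]
gaze = [(-1.0, 1.0, 1.0), (5.0, 100.0, 200.0), (12.0, 3.0, 3.0), (20.0, 5.0, 5.0)]
print(align(log, gaze))
```

Samples recorded before the first log event, or while no resource page is visible, are dropped, so only gaze on resource pages is passed on to AOI matching.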


Students’ interactions with the learning environment are stored 
with timestamps, in log files. This includes all student activities 
such as Read, Build, Notes, and Assess actions. To extract the
content-based eye-gaze features, we define the bounding box coordinates
[x, y] of three AOI regions: (a) the title, (b) the image, and (c) the
sentence that explains the causal relationship between entities. The
AOI positions vary for each resource page; hence, a key file is created
with the start and end positions of each AOI region for each hypertext
resource page in the learning environment. Table 1 shows a sample key
file with details of AOIs for a science resource page “Solar Energy 
and Absorbed Light” [33]. The sentence “The more solar energy 
that the Earth receives, the more light energy it will absorb.” de- 
scribes the causal relationship between the two entities “Solar en- 
ergy” and “Absorbed light energy” that is relevant for the causal 
model. The [x, y] coordinates of the starting and ending positions
of the AOIs are identified for a display with a screen resolution
of 1600 × 900 and recorded in the key file.
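

As a rough illustration of how a key file can be used, the sketch below stores the AOI coordinates reported in Table 1 and tests which AOI, if any, contains a gaze point. The dictionary layout, the AOI names, and the function `aoi_for_gaze` are assumptions for illustration, not the format of the actual key file.

```python
from typing import Dict, Optional, Tuple

Point = Tuple[float, float]
Box = Tuple[Point, Point]  # (starting [x, y], ending [x, y])

# One key-file entry, using the coordinates from Table 1 (1600 x 900 display).
# The nesting and the key names are hypothetical.
KEY_FILE: Dict[str, Dict[str, Box]] = {
    "Solar Energy and Absorbed Light": {
        "title": ((417, 120), (734, 145)),
        "image": ((415, 350), (810, 640)),
        "causal_relation": ((416, 281), (1330, 305)),
    }
}


def aoi_for_gaze(page: str, gaze: Point) -> Optional[str]:
    """Return the name of the AOI containing the gaze point, or None."""
    x, y = gaze
    for name, ((x0, y0), (x1, y1)) in KEY_FILE.get(page, {}).items():
        if x0 <= x <= x1 and y0 <= y <= y1:
            return name
    return None


print(aoi_for_gaze("Solar Energy and Absorbed Light", (500, 290)))
```

Because each page has its own entry, the same lookup works across resource pages whose titles, images, and causal sentences sit at different screen positions.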


The raw data from the eye-tracking device contains eye-gaze posi- 
tion on the display represented as [x, y] coordinates with the 
timestamp for each sample. The number of samples per second is
determined by the sampling rate of the eye-tracking device. The
timestamps in the log data and in the raw eye-tracking data are used
to align and combine them for further analyses. Using the aligned
data and the positions of the AOIs from the key file, the eye-gaze
information on AOIs is extracted and then used to compute content-based
eye-gaze features. Eye movements while reading are measured by
fixations (duration of gaze focused on the same point) and saccades
(movement of gaze between two fixations) [27, 28]. In this study,
we used four frequently used measures of fixation [15, 29] and two
frequently used measures based on saccades as features, as summarized
in Table 2. The features are computed for each of the three
AOIs discussed above and also for the total page, thus providing a
minimum of 4 × 4 = 16 content-based eye-gaze features for each
hypertext resource page. Some of the hypertext resources contained 
multiple sentences that explain the causal relationship between en- 
tities. 
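

The fixation measures and the relative saccade angle can be sketched as below. This is not the extraction code used in the study (fixations and saccades were obtained with an existing toolbox, as described in Section 4.4); it assumes fixations are already available as (x, y, duration) tuples, approximates a saccade as the vector between two successive fixations, and uses hypothetical feature names.

```python
import math
from typing import Dict, List, Tuple

# A fixation as (x, y, duration_ms); an AOI as ((x0, y0), (x1, y1)).
Fixation = Tuple[float, float, float]
Box = Tuple[Tuple[float, float], Tuple[float, float]]


def fixation_features(fixations: List[Fixation], aois: Dict[str, Box]) -> Dict[str, float]:
    """Fixation count and mean duration for the whole page and for each AOI."""
    def summarize(subset: List[Fixation], prefix: str) -> Dict[str, float]:
        count = len(subset)
        mean = sum(f[2] for f in subset) / count if count else 0.0
        return {f"{prefix}_fixation_count": count, f"{prefix}_mean_duration_ms": mean}

    features = summarize(fixations, "page")
    for name, ((x0, y0), (x1, y1)) in aois.items():
        inside = [f for f in fixations if x0 <= f[0] <= x1 and y0 <= f[1] <= y1]
        features.update(summarize(inside, name))
    return features


def relative_saccade_angles(fixations: List[Fixation]) -> List[float]:
    """Angle in degrees (0 to 180) between each pair of consecutive saccades."""
    vectors = [(b[0] - a[0], b[1] - a[1]) for a, b in zip(fixations, fixations[1:])]
    angles = []
    for (ux, uy), (vx, vy) in zip(vectors, vectors[1:]):
        diff = math.atan2(vy, vx) - math.atan2(uy, ux)
        # Wrap the difference into [-pi, pi) and report its magnitude.
        diff = (diff + math.pi) % (2 * math.pi) - math.pi
        angles.append(abs(math.degrees(diff)))
    return angles
```

Calling `fixation_features` with the three AOIs of a page yields a page-level and an AOI-level count and mean duration for each region; pages with several causal sentences would simply contribute additional AOI entries.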


Figure 2: Algorithm to Extract Content-Based Eye-Gaze Features
from Multiple Hypertext Resources


Table 1: Sample Key file with AOIs for a resource page


AOI | Starting position in [x, y] coordinates | Ending position in [x, y] coordinates
Image | [415, 350] | [810, 640]
Title | [417, 120] | [734, 145]
Causal Relation | [416, 281] | [1330, 305]
Entities | Solar energy | Absorbed light energy
Causal Relationship between entities | The more solar energy that the Earth receives, the more light energy it will absorb.


4. EXPERIMENTAL METHODOLOGY 


The analysis presented in this paper is based on a recent study of 
Betty’s Brain. The data was collected from eighteen 6th grade stu- 
dents from two classrooms of a middle school in Nashville, Ten- 
nessee, USA. 


Students used the Betty’s Brain system to learn about the causes 
and effects of climate change. The students’ goal was to develop a 
causal map containing 22 concepts and 25 links representing the 
greenhouse effect (e.g. solar energy, absorbed light energy), hu- 
man activities affecting global climate change (e.g. deforestation, 
vehicle use), and impacts on climate (e.g. sea ice, ocean level,
drought). The hypertext resources were organized into one intro- 
ductory page, three pages covering the greenhouse effect, four 
pages covering human activities, and two pages covering impacts 
on climate. Additionally, a glossary section provided a description 
of some of the concepts, one per page. The complete resources were 
made up of 31 hypertext pages.¹


Table 2: Description of eye-gaze features


Feature | Description
Fixation Count | Total number of fixations counted in a page
Average Fixation Duration in milliseconds | Mean of fixation duration on a page (i.e., gaze duration mean)
Fixation Count on AOI | Total number of fixations counted in an AOI
Average Fixation Duration on AOI | Mean of fixation duration on AOI
Relative Saccade Angle in degrees | The relative angle between two consecutive saccades
Saccade Amplitude | The size of the saccade measured in degrees or minutes of arc


4.1 Study Procedure 

The study was conducted over seven school days, with students 
participating in the study for one 60-minute class period each day. 
On day 1, students completed the pretest. On day 2, students 
worked with the Betty’s Brain introduction topic to get hands-on training
on how to identify causal relations while reading text passages.
During the second day, we also trained the students on how to calibrate
the eye tracker and helped them to create their eye-tracking
profile on the laptop. In this study, we used nine Tobii 4c eye-tracking
devices to collect students’ eye-gaze data. The eye trackers were
attached to the laptop computer just below the screen using magnetic
strips. Students calibrated using the inbuilt Tobii Eye Tracking
software² that displays on-screen instructions followed by a six-point
calibration sequence, where the points appear on the screen
and disappear when students fixate on each point. Students
worked on the Betty’s Brain climate change topic for four class periods
(days 3-6). During these periods, students first selected their
eye-tracking profile and calibrated their gaze points using nine-point
calibration without the help of researchers. On the last day, students
completed the post-test that was identical to the pre-test.


4.2 Data Collection 


To extract content-based eye-gaze features, we combined data from
the Tobii 4c eye-tracking devices with log data from the Betty’s Brain
system, collected as students worked on the climate change topic on
days 3-6 of the study.


¹ The Betty’s Brain system can be downloaded from
https://wp0.vanderbilt.edu/oele/software/


² The Tobii Eye Tracking software was downloaded
from https://tobiigaming.com/getstarted/


4.3 Validation of Eye-Tracking Data 

Researchers helped the students to set up and calibrate the eye- 
tracking device during the training day (second session) for a total 
of 18 students. However, we were not able to use the data from two
students due to continuous calibration failure; hence, we used the
eye-gaze data collected from 16 students in this analysis.


On average, eye-gaze data were obtained for 53.3% of the entire
duration that each student interacted with the learning environment.
The loss of data can be attributed to students’ (a) focus
on the keyboard while taking notes and typing labels for keywords,
(b) interaction with other students, and (c) focus on the teacher or
researcher during instructions. To assess the degree to which the
proportion of data collected was caused by stable individual differences
between students, we correlated the average proportion of
data collected over days 1 & 3 for each student with the average
duration of data collected over days 2 & 4. This correlation was
very strong (r = 0.89), suggesting that variation in the amount of
data collected for each student was strongly
affected by individual differences between students. However,
given the noisy classroom environment, the overall amount of eye-gaze
data collected for 16 of the 18 students was a promising sign
that consumer-level eye trackers could be useful in this setting.
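

The correlation check described above amounts to a Pearson correlation between two per-student averages. A minimal sketch follows; the function name `pearson_r` is ours, and the per-student values in the usage example are made-up placeholders used only to show the call, not data from the study.

```python
import math
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Placeholder values: one entry per student, averaged over each pair of days.
days_1_and_3 = [0.40, 0.55, 0.62, 0.35, 0.70]
days_2_and_4 = [0.45, 0.50, 0.66, 0.30, 0.72]
print(round(pearson_r(days_1_and_3, days_2_and_4), 2))
```

A value close to 1 indicates that students who yielded more data on one pair of days also yielded more on the other pair.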


4.4 Data Analysis and Methodology 

We processed the eye-gaze data using the PyGaze analyzer, an
open-source toolbox for eye tracking [8], to extract fixations and
saccades. The key file, as shown in Table 1, was developed based on
the AOIs in ten hypertext resources. The eye-gaze features described
in Table 2 were extracted using the data collected from the 16 students.


To predict learners’ performance in the map-building activity using
only the eye-gaze data, we considered the map-building activities
(ADD, EDIT, and DELETE causal links) that immediately followed,
and were supported³ by, hypertext Read actions [34]. The research
methodology to model learners’ performance using eye-gaze data
on hypertext resources during Read actions is shown in Figure 3.
The eye-gaze features extracted during each Read action and the
performance on the subsequent supported Build action were used as
labeled data to train and validate the classifier. The trained classifier
was then applied to eye-gaze features extracted during all
Read actions to classify the learner’s reading behavior on hypertext
resources as effective or ineffective. The average number of effective
and ineffective Read actions over a session was then used to
model learners’ performance on causal map-building actions in the
same session.


5. RESULTS 


In this section, we first describe the results of eye-gaze feature ex- 
traction and performance of the classifier trained using labeled data. 
Then the analysis of modeling learners’ performance using reading 
behavior is discussed. 


³ The two sequential actions Read → Build are considered supported
only if the information acquired in the Read action is used in
the Build action.


Figure 3: Research Methodology to Predict Learning from
Learner’s Reading Behavior


We extracted eye-gaze features during 160 Read actions that were
immediately followed by Build actions they supported, using the 16
students’ log and eye-tracking data. Of these 160 sets of eye-gaze
features, 36 (22%) were removed due to insufficient eye-gaze data
(total duration of eye-gaze on page < 1 millisecond). Of the remaining
124 sets of features collected during Read actions, 104 were followed
by correct Build actions, resulting in an increased map score, and only
20 were followed by edit actions that resulted in a decrease in
performance. To develop a classifier model using this imbalanced
dataset, we used the Synthetic Minority Over-sampling Technique
(SMOTE) algorithm [6] to upsample the minority class (incorrect
edits). SMOTE is used to avoid the overfitting that can occur when
minority samples are simply replicated during upsampling.
In SMOTE, a subset of data is taken from the minority class to create
similar synthetic instances, which are then added to the original
dataset.
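

The core idea of SMOTE can be sketched in plain Python: each synthetic instance is placed at a random position on the line segment between a minority sample and one of its nearest minority-class neighbors. This is a simplified illustration, not the RapidMiner operator used in the study; the function name, the default `k`, and the seeding are assumptions.

```python
import random
from typing import List, Sequence

Vector = Sequence[float]


def smote(minority: List[Vector], n_synthetic: int, k: int = 5, seed: int = 0) -> List[List[float]]:
    """Create n_synthetic points by interpolating between minority-class neighbors."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbors of the base sample (squared Euclidean distance).
        others = [m for j, m in enumerate(minority) if j != i]
        others.sort(key=lambda m: sum((a - b) ** 2 for a, b in zip(base, m)))
        neighbor = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from base to neighbor
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbor)])
    return synthetic
```

With 104 majority and 20 minority instances, generating 84 synthetic minority instances would balance the classes at 104 each, which is consistent with the per-class totals in Table 3.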


We used the gradient tree boosting algorithm [11] for predicting
map edit actions. In this algorithm, many classification models are
trained sequentially, and the loss function of each model is minimized
using a gradient descent method. In this analysis, we used
decision trees as the classification model for gradient boosting. We
used RapidMiner [13] for implementing upsampling and gradient
tree boosting. The classification results using 10-fold cross-validation
are shown in Table 3.


The gradient tree boosting algorithm predicted the correctness of
map edit actions with an accuracy of 80.83%, Cohen’s kappa κ =
0.62, and F1 score = 0.82.


Table 3: Predicting Performance on Map Edit Actions.


Predicted \ Actual | Map Edit (+) | Map Edit (-) | Class Precision
Map Edit (+) | 79 | 15 | 84.04%
Map Edit (-) | 25 | 89 | 78.07%
Class Recall | 75.96% | 85.58% |
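

The summary metrics can be recomputed from the counts in Table 3, as sketched below (variable names are ours). Pooling the counts gives an accuracy of about 80.77%, slightly below the reported 80.83%; one possible explanation, which we have not verified, is that the reported value is averaged over the cross-validation folds. The kappa of 0.62 is reproduced, and the F1 score of 0.82 corresponds to the Map Edit (-) class.

```python
# Counts from Table 3 (rows: predicted class, columns: actual class).
tp, fp = 79, 15  # predicted Map Edit (+): actual (+), actual (-)
fn, tn = 25, 89  # predicted Map Edit (-): actual (+), actual (-)

total = tp + fp + fn + tn
accuracy = (tp + tn) / total

# Precision and recall for each class.
precision_pos = tp / (tp + fp)
recall_pos = tp / (tp + fn)
precision_neg = tn / (tn + fn)
recall_neg = tn / (tn + fp)


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Cohen's kappa: observed agreement corrected for chance agreement.
p_expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total ** 2
kappa = (accuracy - p_expected) / (1 - p_expected)

print(f"accuracy={accuracy:.4f} kappa={kappa:.2f}")
print(f"F1(+)={f1(precision_pos, recall_pos):.2f} F1(-)={f1(precision_neg, recall_neg):.2f}")
```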


The trained gradient tree boosting classifier was then used to classify
learners’ reading behavior as effective or ineffective using eye-gaze
data from all of the Read actions. We extracted 1987 sets of
eye-gaze features during the Read actions of all students. Of these
1987 reading behaviors, 329 (16.5%) were classified as ineffective
and the rest were classified as effective. Without applying any
upsampling technique, for each student, we computed the number
of effective and ineffective Read actions per session. To model
learners’ performance in map-building actions using their reading
behavior on hypertext resources, we used a linear regression with
the net change in map score per session as the dependent variable.
The regression statistics are described in Table 4.


Table 4: Regression Statistics


Statistic | Value
Multiple R | 0.515
R Square | 0.262
Adjusted R Square | 0.229
Standard Error | 3.675
Observations | 49


Learners’ performance in the map-building task could be predicted
from the numbers of effective and ineffective Read actions by using the
following formula:


Performance = 0.17 × (# of effective page Read actions) + 0.21 × (#
of ineffective page Read actions) - 1.46; R = 0.51.


The correlation value, R, indicates a moderate degree of correlation
between the independent variables (numbers of effective and
ineffective Read actions) and the dependent variable (performance in
the map-building actions).
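

For concreteness, the fitted formula and the adjusted R² statistic can be written as small Python functions. The function names are ours, and the session counts in the usage line are arbitrary illustrative inputs. The adjusted R² check assumes two predictors and the 49 observations listed in Table 4, and it agrees with the tabulated value only approximately because the inputs are rounded.

```python
def predicted_performance(n_effective: int, n_ineffective: int) -> float:
    """Net change in map score per session, using the coefficients reported above."""
    return 0.17 * n_effective + 0.21 * n_ineffective - 1.46


def adjusted_r_squared(r_squared: float, n_observations: int, n_predictors: int) -> float:
    """Standard adjustment of R^2 for the number of predictors."""
    return 1 - (1 - r_squared) * (n_observations - 1) / (n_observations - n_predictors - 1)


# Illustrative session with 10 effective and 2 ineffective Read actions.
print(round(predicted_performance(10, 2), 2))
# R Square and Observations from Table 4, assuming two predictors.
print(round(adjusted_r_squared(0.262, 49, 2), 3))
```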


The results of the classifier models trained on the originally
imbalanced (upsampled) data show that prediction of learners’
performance for each link-creation event, using only content-based
eye-gaze features, was significantly greater than chance (Cohen’s
kappa κ = 0.62; F1 score = 0.82). The results of the linear
regression model indicate the ability
to predict learners’ performance on map-building tasks based on
their reading behaviors observed during Read actions.


6. CONCLUSIONS, LIMITATIONS AND 
FUTURE WORK 


Our goals in this research were threefold: (1) to test the effectiveness
of using consumer-level eye-tracking devices in a noisy classroom
environment; (2) to extract content-level eye-gaze features
while learners read hypertext resources in the learning environment;
and (3) to predict learners’ performance based on their
reading behavior. In this study, we collected eye-gaze data from 16
middle school students while they worked with the Betty's Brain
learning environment in a noisy classroom. We proposed a methodology
to extract content-level eye-gaze features and applied it to
the data collected from our study. The extracted features were able
to predict learners’ performance in the map-building task with an F1
score of 0.82. These results show that learners’ performance can be
tracked and predicted, which can be used to provide real-time feedback
and adaptive instruction to them.


The present study has two limitations. First, we were able to extract
only 124 sets of eye-gaze features during the reading task to train the
classifier to predict learning. Also, the extracted data were
imbalanced, necessitating the use of an upsampling technique to train
and validate the classifier. Second, we were able to collect
eye-tracking data for only 54% of the entire duration of students’
interaction with the learning environment in the real classroom
setting, due to the unstructured nature of the environment.


In addition to collecting more data in our future studies, we propose
to analyze students’ learning behaviors not only from their reading
behaviors, but also from learners’ other interactions with the system,
such as their quiz answers and their interactions with the
two virtual agents in the system: the mentor, Mr. Davis, and the
teachable agent, Betty. The goal is to derive more precise information
about the coherence relations between actions (see [34]). We
also propose to implement real-time eye-gaze analysis to provide
personalized feedback based on learners’ reading behavior.